Fast Matrix Multiplication Algorithms on MIMD Architectures
Abstract
Sequential fast matrix multiplication algorithms of Strassen and Winograd are studied; the complexity bound given by Strassen is improved. These algorithms are parallelized on MIMD distributed memory architectures of ring and torus topologies; a generalization to a hyper-torus is also given. Complexity and efficiency are analyzed and good asymptotic behaviour is proved. These new parallel algorithms are compared with standard algorithms on a 128-processor parallel computer; experiments confirm the theoretical results.
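For orientation, the sequential starting point named in the abstract can be sketched as follows. This is a minimal textbook version of Strassen's seven-multiplication recursion, assuming square matrices whose size is a power of two; it is not the paper's improved variant, the Winograd form, or any of the parallel ring/torus algorithms. The helper names (`add`, `sub`, `split`, `naive`) and the `cutoff` parameter are illustrative assumptions.

```python
# Textbook Strassen recursion (illustrative sketch, not the paper's code).
# Assumes A and B are n x n lists of lists with n a power of two.

def add(A, B):
    return [[a + b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def sub(A, B):
    return [[a - b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M):
    # Return the four quadrants M11, M12, M21, M22.
    h = len(M) // 2
    return ([r[:h] for r in M[:h]], [r[h:] for r in M[:h]],
            [r[:h] for r in M[h:]], [r[h:] for r in M[h:]])

def naive(A, B):
    # Standard O(n^3) product, used as the recursion base case.
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def strassen(A, B, cutoff=1):
    # cutoff (hypothetical tuning knob, must be >= 1): below this size,
    # fall back to the standard product.
    n = len(A)
    if n <= cutoff:
        return naive(A, B)
    a11, a12, a21, a22 = split(A)
    b11, b12, b21, b22 = split(B)
    # Seven recursive products instead of eight.
    m1 = strassen(add(a11, a22), add(b11, b22), cutoff)
    m2 = strassen(add(a21, a22), b11, cutoff)
    m3 = strassen(a11, sub(b12, b22), cutoff)
    m4 = strassen(a22, sub(b21, b11), cutoff)
    m5 = strassen(add(a11, a12), b22, cutoff)
    m6 = strassen(sub(a21, a11), add(b11, b12), cutoff)
    m7 = strassen(sub(a12, a22), add(b21, b22), cutoff)
    # Recombine the quadrants of the result.
    c11 = add(sub(add(m1, m4), m5), m7)
    c12 = add(m3, m5)
    c21 = add(m2, m4)
    c22 = add(add(sub(m1, m2), m3), m6)
    top = [left + right for left, right in zip(c11, c12)]
    bottom = [left + right for left, right in zip(c21, c22)]
    return top + bottom
```

Replacing eight block products with seven at every level is what lowers the operation count below cubic; a practical implementation would typically use a larger `cutoff` so that small blocks use the standard product.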
Similar resources
Generalizing of a High Performance Parallel Strassen Implementation on Distributed Memory MIMD Architectures
Strassen’s algorithm to multiply two n×n matrices reduces the asymptotic operation count from O(n^3) of the traditional algorithm to O(n^2.81), thus designing efficient parallelizing for this algorithm becomes essential. In this paper, we present our generalizing of a parallel Strassen implementation which obtained a very nice performance on an Intel Paragon: faster 20% for n ≈ 1000 and more than 100%...
A New Parallel Matrix Multiplication Method Adapted on Fibonacci Hypercube Structure
The objective of this study was to develop a new optimal parallel algorithm for matrix multiplication which could run on a Fibonacci Hypercube structure. Most of the popular algorithms for parallel matrix multiplication cannot run on Fibonacci Hypercube structure, therefore giving a method that can be run on all structures especially Fibonacci Hypercube structure is necessary for parallel matr...
Skeletons for Divide and Conquer Algorithms
Algorithmic skeletons intend to simplify parallel programming by providing recurring forms of program structure as predefined components. We present a fully distributed task parallel skeleton for a very general class of divide and conquer algorithms for MIMD machines with distributed memory. This approach is compared to a simple master-worker design. Based on experimental results for different e...
Algorithm-Based Fault-Tolerant Strategies in Faulty Hypercube and Star
This dissertation addresses the design of algorithm-based fault-tolerant strategies in faulty hypercube and star graph multicomputers without hardware modification. Several new concepts and designs are presented here under the permanent and transient fault models. Under the permanent fault model, we propose a new fault-tolerant reconfiguration scheme in the faulty hypercube and star graph multico...
A Quantitative Code Analysis of Scientific Systolic Programs: DSP vs. Matrix Algorithms
In this paper we consider systolic programs of the most common DSP (convolution, FIR, IIR, FFT) and Matrix (multiplication, triangularisation, linear equation solving, modified Faddeev algorithm) algorithms, executed on systolic arrays of various topologies (linear, 2D mesh, hexagonal). We examine the algorithm-specific parameters (number of I/O paths, unit delays) and program-dependent paramet...
Journal:
- Parallel Algorithms Appl.
Volume 4
Publication date: 1994